COVID-19 is causing havoc in Oregon once again, and as numbers continue to spike, I decided to revisit one of the first projects I worked on in RStudio. That project can be seen here, and used data from Johns Hopkins. However, because that data is no longer updated, this investigation uses the more current data from The New York Times. The repository for the NY Times data can be found here, and the datasets included are:
us.states : state level data (file description here)
us.counties : county-level data (file description here)
colleges : number of reported cases among students and employees at American colleges and universities, updated May 26th (file description here)
mask_use : survey conducted between July 2 and July 14 (2020) in which participants were asked, “How often do you wear a mask in public when you expect to be within six feet of another person?” (file description here)
vacc: state level COVID-19 daily vaccination numbers time series data from the Johns Hopkins University repository (file description here)
policytrackerOR: dates and descriptions of policies going into or out of effect in Oregon. To load data for a different state, go here and find the name of the state file you want to work with.
Here is the data:
library(tidyverse)  # provides read_csv() and the %>% pipe used below
us.states <- read_csv('https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv')
us.counties <- read_csv('https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv')
colleges <- read_csv('https://raw.githubusercontent.com/nytimes/covid-19-data/master/colleges/colleges.csv')
mask_use <- read_csv('https://raw.githubusercontent.com/nytimes/covid-19-data/master/mask-use/mask-use-by-county.csv')
vacc <- read_csv("https://raw.githubusercontent.com/govex/COVID-19/master/data_tables/vaccine_data/us_data/time_series/people_vaccinated_us_timeline.csv")
policytrackerOR <- read_csv("https://raw.githubusercontent.com/govex/COVID-19/govex_data/data_tables/policy_data/table_data/Current/Oregon_policy.csv")
The project cited above mainly looked at case, death, and vaccination numbers per state to compare highly and mildly impacted states. In this project I will look at highly impacted states and the counties of Oregon. Additionally, this project uses population and density data from the tidycensus package. I discuss how I got this data through an API in a blog post here. Note that this data is from 2019, which is a couple of years older than the COVID data.
library(tidycensus)  # get_estimates() queries the Census Bureau API
# State : POP and DENSITY data
state.pop <- get_estimates(geography = "state", year = 2019, variable = "POP") %>% rename ("state" = NAME, "population" = value)
state.den <- get_estimates(geography = "state", year = 2019, variable = "DENSITY") %>% rename ("state" = NAME, "density" = value)
# OREGON : POP and Density data
or.county.pop <- get_estimates(geography = "county", state = "OR", year = 2019, variable = "POP") %>% rename ("county" = NAME, "population" = value)
or.county.den <- get_estimates(geography = "county", state = "OR", year = 2019, variable = "DENSITY") %>% rename ("county" = NAME, "density" = value)
The COVID data set is already in long form (meaning the dates are in rows instead of columns), and the date is already stored as a date variable. The main tasks here are therefore to join the us.states data set with the vaccination records to create us.states.vacc, then join us.states.vacc with the population and density estimates and create new percentage columns.
## # A tibble: 5 × 12
## date state population density cases case.per deaths perc.deaths
## <date> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2020-01-21 Washington 7614893 115. 1 0.000000131 0 0
## 2 2020-01-22 Washington 7614893 115. 1 0.000000131 0 0
## 3 2020-01-23 Washington 7614893 115. 1 0.000000131 0 0
## 4 2020-01-24 Illinois 12671821 228. 1 0.0000000789 0 0
## 5 2020-01-24 Washington 7614893 115. 1 0.000000131 0 0
## # … with 4 more variables: full.vacc <dbl>, full.vacc.perc <dbl>,
## # part.vacc <dbl>, part.vacc.perc <dbl>
Note: People_Fully_Vaccinated and People_Partially_Vaccinated are NA for these rows because the dates displayed are from before the vaccine was released.
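
The joins described above can be sketched roughly as follows. This is a minimal illustration on toy stand-ins rather than the code actually used for this post: the `.toy` objects are hypothetical, the case, death, and vaccination counts for 2021-06-01 are made up, and the join keys `Date` and `Province_State` are my assumption about the column names in the vaccination file. The real `get_estimates()` output also carries extra columns (such as GEOID) that would need to be dropped before joining.

```r
library(dplyr)

# Toy stand-ins for us.states, vacc, state.pop and state.den.
# Counts on 2021-06-01 are made-up illustrative values.
us.states.toy <- tibble(
  date   = as.Date(c("2020-01-21", "2021-06-01")),
  state  = c("Washington", "Washington"),
  cases  = c(1, 400000),
  deaths = c(0, 5000)
)
vacc.toy <- tibble(
  Date                        = as.Date("2021-06-01"),  # assumed column name
  Province_State              = "Washington",           # assumed column name
  People_Fully_Vaccinated     = 3000000,
  People_Partially_Vaccinated = 800000
)
state.pop.toy <- tibble(state = "Washington", population = 7614893)
state.den.toy <- tibble(state = "Washington", density = 115)

us.states.vacc.toy <- us.states.toy %>%
  # left joins keep every case row, so dates without vaccination data become NA
  left_join(vacc.toy, by = c("date" = "Date", "state" = "Province_State")) %>%
  left_join(state.pop.toy, by = "state") %>%
  left_join(state.den.toy, by = "state") %>%
  mutate(
    case.per       = cases / population,
    perc.deaths    = deaths / population,
    full.vacc      = People_Fully_Vaccinated,
    full.vacc.perc = full.vacc / population,
    part.vacc      = People_Partially_Vaccinated,
    part.vacc.perc = part.vacc / population
  ) %>%
  select(date, state, population, density, cases, case.per, deaths,
         perc.deaths, full.vacc, full.vacc.perc, part.vacc, part.vacc.perc)
```

Using `left_join()` rather than `inner_join()` is what keeps the early 2020 rows in the result, with NA in the vaccination columns.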
Next, join the Oregon county population data with the density data, remove “County, Oregon” from each county name, filter the us.counties data for Oregon, and then join it with the per-county population data.
## # A tibble: 10 × 8
## date county population density cases cases.perc deaths deaths.perc
## <date> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2020-02-28 Washington 601592 831. 1 0.00000166 0 0
## 2 2020-02-29 Washington 601592 831. 1 0.00000166 0 0
## 3 2020-03-01 Washington 601592 831. 2 0.00000332 0 0
## 4 2020-03-02 Washington 601592 831. 2 0.00000332 0 0
## 5 2020-03-03 Washington 601592 831. 2 0.00000332 0 0
## 6 2020-03-04 Washington 601592 831. 2 0.00000332 0 0
## 7 2020-03-05 Washington 601592 831. 2 0.00000332 0 0
## 8 2020-03-06 Washington 601592 831. 2 0.00000332 0 0
## 9 2020-03-07 Jackson 220944 79.4 2 0.00000905 0 0
## 10 2020-03-07 Klamath 68238 11.5 1 0.0000147 0 0
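
The county-level steps described above might look something like this. It is a sketch on toy data, not the exact code behind the output: the `.toy` objects are hypothetical, the Jackson and Klamath figures are taken from the output above, and the King County row uses made-up counts purely to show the filter at work.

```r
library(dplyr)

# Toy stand-ins for or.county.pop, or.county.den and us.counties
or.county.pop.toy <- tibble(
  county     = c("Jackson County, Oregon", "Klamath County, Oregon"),
  population = c(220944, 68238)
)
or.county.den.toy <- tibble(
  county  = c("Jackson County, Oregon", "Klamath County, Oregon"),
  density = c(79.4, 11.5)
)
us.counties.toy <- tibble(
  date   = as.Date("2020-03-07"),
  county = c("Jackson", "Klamath", "King"),
  state  = c("Oregon", "Oregon", "Washington"),
  cases  = c(2, 1, 10),   # King County counts are made up
  deaths = c(0, 0, 1)
)

# Join population with density, then strip the " County, Oregon" suffix
# so the names match the county column in the NY Times data
or.county.info.toy <- or.county.pop.toy %>%
  left_join(or.county.den.toy, by = "county") %>%
  mutate(county = sub(" County, Oregon", "", county, fixed = TRUE))

# Keep only Oregon rows, attach the census data, and compute the shares
or.counties.toy <- us.counties.toy %>%
  filter(state == "Oregon") %>%
  left_join(or.county.info.toy, by = "county") %>%
  mutate(
    cases.perc  = cases / population,
    deaths.perc = deaths / population
  ) %>%
  select(date, county, population, density, cases, cases.perc,
         deaths, deaths.perc)
```

Filtering on `state == "Oregon"` before the join matters because county names repeat across states (Washington County, Oregon versus the counties of Washington state, for example).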
To begin, let's look at the country as a whole, by state. The data is filtered for 2022-01-28, and then we look at the top states (five in most tables, ten for the highest percentage of cases) by each of the following measures:

Most Cases, by state as of 2022-01-28

| State | Total Population | Cases | Percent of Population |
|---|---|---|---|
| California | 39,512,223 | 8,248,681 | 20.88% |
| Texas | 28,995,881 | 6,122,432 | 21.11% |
| Florida | 21,477,737 | 5,478,894 | 25.51% |
| New York | 19,453,561 | 4,761,590 | 24.48% |
| Illinois | 12,671,821 | 2,901,790 | 22.90% |

Highest Percent of Cases, by state as of 2022-01-28

| State | Total Population | Cases | Percent of Population |
|---|---|---|---|
| Rhode Island | 1,059,361 | 341,407 | 32.23% |
| North Dakota | 762,062 | 221,072 | 29.01% |
| Alaska | 731,545 | 211,117 | 28.86% |
| Utah | 3,205,958 | 875,414 | 27.31% |
| South Carolina | 5,148,714 | 1,355,116 | 26.32% |
| Tennessee | 6,829,174 | 1,785,235 | 26.14% |
| Wisconsin | 5,822,434 | 1,503,538 | 25.82% |
| Kentucky | 4,467,673 | 1,152,288 | 25.79% |
| Florida | 21,477,737 | 5,478,894 | 25.51% |
| South Dakota | 884,659 | 225,383 | 25.48% |

Most Deaths, by state as of 2022-01-28

| State | Total Population | Deaths | Percent of Population |
|---|---|---|---|
| California | 39,512,223 | 79,934 | 0.20% |
| Texas | 28,995,881 | 79,324 | 0.27% |
| Florida | 21,477,737 | 64,955 | 0.30% |
| New York | 19,453,561 | 63,910 | 0.33% |
| Pennsylvania | 12,801,989 | 40,394 | 0.32% |

Highest Percent of Deaths, by state as of 2022-01-28

| State | Total Population | Deaths | Percent of Population |
|---|---|---|---|
| Mississippi | 2,976,149 | 10,831 | 0.36% |
| Arizona | 7,278,717 | 26,001 | 0.36% |
| New Jersey | 8,882,190 | 31,320 | 0.35% |
| Alabama | 4,903,185 | 16,826 | 0.34% |
| Louisiana | 4,648,794 | 15,631 | 0.34% |

Most People Fully Vaccinated, by state as of 2022-01-28

| State | Total Population | People Fully Vaccinated | Percent of Population |
|---|---|---|---|
| California | 39,512,223 | 27,476,629 | 69.54% |
| Texas | 28,995,881 | 17,029,057 | 58.73% |
| New York | 19,453,561 | 14,389,193 | 73.97% |
| Florida | 21,477,737 | 13,953,387 | 64.97% |
| Pennsylvania | 12,801,989 | 8,401,030 | 65.62% |

Highest Percent of People Fully Vaccinated, by state as of 2022-01-28

| State | Total Population | People Fully Vaccinated | Percent of Population |
|---|---|---|---|
| District of Columbia | 705,749 | 614,182 | 87.03% |
| Puerto Rico | 3,193,694 | 2,547,802 | 79.78% |
| Vermont | 623,989 | 494,648 | 79.27% |
| Rhode Island | 1,059,361 | 834,592 | 78.78% |
| Maine | 1,344,212 | 1,041,199 | 77.46% |
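
Rankings like the ones above can be produced by filtering to a single date and keeping the largest rows. The following is a minimal sketch using `slice_max()` from dplyr, on a hypothetical six-state subset built from figures in the tables above. The `states.toy`, `snapshot`, `most.cases`, and `highest.perc` names are mine, and the post's own tables may have been built differently.

```r
library(dplyr)

# Six states copied from the tables above, enough to show the two rankings
states.toy <- tibble(
  date       = as.Date("2022-01-28"),
  state      = c("California", "Texas", "Florida", "New York",
                 "Illinois", "Rhode Island"),
  population = c(39512223, 28995881, 21477737, 19453561, 12671821, 1059361),
  cases      = c(8248681, 6122432, 5478894, 4761590, 2901790, 341407)
) %>%
  mutate(case.per = cases / population)

# Keep one row per state by filtering to the snapshot date
snapshot <- states.toy %>%
  filter(date == as.Date("2022-01-28"))

# Top five by raw case count
most.cases <- snapshot %>%
  slice_max(cases, n = 5)

# Top five by share of population, with a formatted percentage column
highest.perc <- snapshot %>%
  slice_max(case.per, n = 5) %>%
  mutate(percentage = sprintf("%.2f%%", 100 * case.per))
```

Ranking by `cases` favors populous states, while ranking by `case.per` surfaces smaller states such as Rhode Island, which is why the two kinds of table look so different.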
Next, let's look at the data a little closer to home: Oregon counties. I first filter by the most recent date, which as of this writing is 2022-01-28, look at a graph of the state as a whole, and then look at the top five Oregon counties.

Most Cases, Oregon counties as of 2022-01-28

| County | Total Population | Cases | Percent of Population |
|---|---|---|---|
| Multnomah | 812,855 | 99,887 | 12.29% |
| Washington | 601,592 | 74,169 | 12.33% |
| Marion | 347,818 | 60,091 | 17.28% |
| Clackamas | 418,187 | 52,944 | 12.66% |
| Lane | 382,067 | 48,058 | 12.58% |

Highest Percentage of Cases, Oregon counties as of 2022-01-28

| County | Total Population | Cases | Percent of Population |
|---|---|---|---|
| Umatilla | 77,950 | 20,623 | 26.46% |
| Jefferson | 24,658 | 6,348 | 25.74% |
| Malheur | 30,571 | 7,501 | 24.54% |
| Morrow | 11,603 | 2,781 | 23.97% |
| Crook | 24,404 | 5,145 | 21.08% |

Most Deaths, Oregon counties as of 2022-01-28

| County | Total Population | Deaths | Percent of Population |
|---|---|---|---|
| Multnomah | 812,855 | 960 | 0.12% |
| Marion | 347,818 | 581 | 0.17% |
| Clackamas | 418,187 | 479 | 0.11% |
| Washington | 601,592 | 472 | 0.08% |
| Jackson | 220,944 | 430 | 0.19% |

Highest Percent of Deaths, Oregon counties as of 2022-01-28

| County | Total Population | Deaths | Percent of Population |
|---|---|---|---|
| Harney | 7,393 | 35 | 0.47% |
| Josephine | 87,487 | 296 | 0.34% |
| Malheur | 30,571 | 97 | 0.32% |
| Douglas | 110,980 | 331 | 0.30% |
| Lake | 7,869 | 23 | 0.29% |